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1 .  INTRODUCTION 


In  the  literature  there  appear  to  be  two  different  approaches  to  the 
derivation  of  the  least  squares  lattice  algorithms:  an  algebraic  and  a 
geometric  approach.  The  first  method  chronologically  precedes  the  former  and 
was  originally  presented  by  Morf,  Lee,  et  al.  (ref  1).  See  ref  2  for  a 
clear  and  precise  development.  The  geometrical  approach  (ref  3,  4)  gives  more 
insight  and  is  much  more  powerful  in  the  sense  that  it  encompasses  an  entire 
class  of  computationally  efficient  algorithms,  (ref  4). 

The  derivation  presented  in  this  report  is  different  than  the  geometrical 
approach  in  that  the  filter  coefficients  are  solved  for  directly  by  Gram- 
Schmidt  orthogonalization  of  the  data  (ref  5,  6).  The  advantage  of  this 
approach  is  to  provide  insight  into  the  numerical  conditioning  of  the 
algorithm,  which  is  shown  to  be  equivalent  to  the  numerical  conditioning  of 
solving  the  so-called  normal  equation  associated  with  least  squares.  Although 
only  the  unnormalized  least  squares  lattice  is  derived  explicitly,  the 
normalized  version  is  briefly  discussed  and  is  easily  shown  to  have  the  same 
conditioning  as  the  unnormalized  version.  A  simple  numerical  example  is  given 
to  illustrate  this  point.  Thus  contrary  to  popular  belief,  the  normalized 
least  squares  lattice  is  not  numerically  superior  to  the  unnormalized  algorithm 


for  ill-conditioned  problems. 


2.  DEFINITIONS 


x(t)  and  i/(t)  shall  represent  two,  complex,  scalar  discrete  time  series 
The  pre-window  case  will  be  assumed,  i.e., 

x(t)  =  y(t)  =  0  t  <  0  (1) 

Let  y( T)  be  a  T  by  1  column  vector  whose  t1*1  component  is  given  by 


[  y( T)]t  =  y( t) 


t  =  0,1, ---T-l 


Also,  let  xX(T)  be  a  T  by  1  column  vector  given  by 


[  3C1  (T)  ]  t  = 


x(t-i) 


i  <  t  <  T-l 


0  <  t  <  i 


where  i  is  an  integer. 

For  any  two  vectors,  u(T),  v(T),  of  dimension  T,  we  define  the  inner 
product 


<u(T) ,v(T)>  =  1  (u(T)]  [v(T) ] 

t=0  L 


This  is  the  familiar  inner  product  of  complex  vectors.  We  also  have  the 
Euclidian  norm 


l|u(T)ir  =  <u(T),  u(T)> 


The  following  T  by  T  diagonal  matrix  is  needed 


\  1  0 
,T-2 


A(T)  = 


\ 

0  1 


2 


(7) 


where  0  <  X  <  1.  Let  A*(T)  be  such  that 
A*(T)A*(T)  =  A(T) 


Then  it  is  quite  obvious  that  A^(T)  is  the  diagonal  matrix 


A^(T)  = 


x(T-l)/2  0 

\ (T— 2 ) / 2 


A* 

0  1 


We  will  need  the  following  two  properties  of  A*(T) 


A*(T+1)  = 


A1/2(T) 


A^A^(T)  j  0 

- r - 


Let  the  T  by  1  column  vectors  xx(T)  and  y(T)  be 


xA(t)  =  A^(T)  Z*(T) 


(10a) 


y(T)  =  a^(t)  </( T) 


(10b) 


It  should  be  clear  from  the  definitions  that 


[x1(T))t  =  x(T-l) 


(11a) 


[y1(T)]T  =  y( T-i) 


(lib) 


These  properties  will  be  useful  later.  We  also  define  the  T  by  N  matrix 
0^ (T)  as 

^(T)  =  (x°(T)  x1(T)-'-xN‘1(T)) 


(12) 


3 .  PROBLEM  STATEMENT 


We  seek  the  N  by  1  column  vector  s  that  minimizes 
squares 


the  weighted  sum  of 


‘  o)' 

‘  X(0) 

0 

x(0) 

...  o 
...  o 
•••  x(0) 

_  y(T)  _ 

_  x  (T) 

X(T-1) 

• 

x (T-N+l)  _ 

2 


A(T+1 ) 


(13) 


The  weight  matrix  A(T+1),  for  K  <  1,  allows  the  filter  to  be  adaptive  by 

~N  “N 

"forgetting"  old  data.  Let  h  (T)  be  the  s  that  minimizes  the  above 
expression  and  denote  the  minimum  by  J^(T).  Thus,  using  the  definitions  of 
Section  2,  we  can  write  the  least  squares  problem  as 


2  2 

Jn(T)  =  min  | |y(T+l)  -  0N(T+l)sN||  =  ||y(T+l)  -  0N(T+1 )hN(T) | |  (14) 


-N 


For  many  applications,  such  as  channel  equalization  with  a  known 

N-1  _N 

sequence,  the  filtered  output  2  x(T-i)[h  (T)].  is  of  interest.  However 

i=0  1 

for  other  applications,  such  as  noise  cancellation  or  decision-directed 

N'1  -N 

equalization,  the  filtered  output  2  x (T-i) [h" (T- 1 ) ] .  is  required  (ref  8,  9). 

i=0  1 

Therefore,  we  will  define  the  two  error  residuals 


eN(T)  =  j/(T)  -  (  x  (T)  •  •  •  x  (T-N+l )  ]hN(T) 


e^(T)  =  t/(T)  -  (  x (T)  •  •  •  x(T-N+l)]hN(T-l) 


(15a) 

(15b) 


A  computationally  efficient  algorithm  shall  be  derived  for  the  time  and 

order  updates  of  e„(T)  and  e'(T).  The  algorithm  is  efficient  in  the  sense 
N  N 

that  it  calculates  e„(T)  and  eL(T)  for  all  orders  N=0,1,2,***N  with  only  on 

N  N  O' 

the  order  of  Nq  calculations  per  time  update.  However,  before  equation  (14) 


can  be  solved,  we  must  solve  the  auxilary  problem  of  the  weighted  least 
squares  one-step  forward  and  backward  predictor  filter,  the  so-called  least 
squares  lattice  algorithm. 


VV.V'.' 


-  .*•  -  *  . 


-/  V 


4.  DERIVATION  OF  THE  LEAST  SQUARES  LATTICE  ALGORITHM 


First,  the  general  solution  to  the  forward  and  backward  filtering 
problem,  along  with  some  useful  properties,  shall  be  presented  in  sections 
4. A.  and  4.B.  Order  updates  are  derived  in  4.C.  and  4.D. ,  followed  by  the 
time  updates  in  4.E.  The  least  squares  lattice  algorithms  are  summarized 
in  5. 

4. A.  THE  ONE-STEP  FORWARD  PREDICTOR 

The  predictor  filter  is  found  by  minimizing: 


X(0)' 

* 

•  / - V  O 

o 

0 

x(T) 

x (T-l) 

m 

x  (0) 

X (T-N) 

(16) 


A(T+1 ) 


-N 


with  respect  to  s  .  By  property  (9a),  the  above  norm  is  equivalent  to 


2  2 

I  Jjt_1(T)  -  0N(T)sN|  |  +  AT|  x  (0)  ( 


(17) 


_N  £ 

Let  the  s  that  achieves  this  minimum  be  f  (T),  with  minimum  norm  J^(T).  Thus 


2  2 

jJ(T)  =  I | x"1 (T)  -  0N(T)fN(T) | |  +  \T|  x  (0) | 


(18) 


The  minimization  of  (17)  can  easily  be  solved  by  finding  a  T  by  T  unitary 
matrix  Q(T),  such  that 


Q^(T)0N(T)  = 


N  { 
r-N  | 


WN(T)' 

o 


(19) 


where  w  (T)  is  a  N  by  N,  nonsingular,  upper  triangular  matrix  (see  ref  5  and 
6).  The  dagger,  "t",  represents  the  complex  conjugate  transpose.  Q(T) 
exists,  provided  0^(T)  is  full  rank  for  N  <  T.  Thus,  {x°(T)****x^  *(T)}  must 
be  linearly  independent  set  of  vectors. 


To  see  how  (19)  solves  (17),  we  make  use  of  the  unitary  property  of  Q(T), 
i.e.,  Qf(T)  Q(?)  =  Q(T)  Qf(T)  =  I,  to  write  (17)  as 

2  2 

|  |Qt(T)x“1(T)  -  Qt(T)0N(T)sN|(  +A.T|x(0)(  (20) 

Using  (19),  we  can  split  the  above  norm  into 


2  2  2 
l|x"J’(T)  -  WH(T)sM|  |  +  ||x'*"(T)||  +  XT|  x  (0)  | 

where, 

N 

Qt(T)x"1(T)  = 

T-N 


(21) 


(22) 


Minimization  of  (21)  is  now  simple,  with  f^(T)  and  jj|j(T)  given  by 

WN(T)fN(T)  =  x"^'(T)  (23a) 

it  2  2 

JJ(T)  =  lirJ"(T)||  +  AT|  x  (0)  |  (23b) 

Let  {q°(T)‘,*,q^  *(T)}  be  found  by  a  Gram-Schmidt  orthogonalization  of 
{x°(T)*“X^  ^(T)}  ;  then  Q(T)  is  given  by 

Q(T)  =  (q°(T) . qT_1(T) )  (24) 


7 


This  can  be  seen  as  follows.  The  Gram-Schmidt  procedure  is  given  by 


_o.T.  _  x°(T) 

q  (T)  -  - - - 2 

l|x°(T)ir 


(25a) 


For  k  =  1 ,  T- 1 ,  do 


qk(T)  =  xk(T)  -  I  <qj(T),  xk(T)>  qj (T) 


(25b) 


-fk(T)  = 

I lqk(T) | | 


(25c) 


End  loop. 

This  procedure  produces  a  set  of  orthonormal  vectors  (q°(T) • • -q^(T) }  that 
span  the  same  space  as  (x°(T)  *  •  •'x^(T) }  for  N  =  0,***T-1.  Denote  this  space  by 


SN(T)  =  {q° (T) * • *q (T) }  =  {x°(T) • *  *x  (T) } 


Also,  we  have  the  additional  property 


<qX(T),  xJ(T)>  =  0 


j  <  i 


Thus,  from  (27)  and  (12),  we  have  (19)  where  W^(T)  is  given  by 

[WN(T)].j  =  <qX(T),  7j(T)>  (2E 

/(T)  is  upper  triangular  because  of  (27).  Also,  Q(T)  is  unitary  by  the 
orthonormal  property  of  {q°(T),’,'q^  *(T)}. 


It  will  also  be  necessary  to  consider  the  vector 


8 


/(T)  =  0N(T)fn(T) 


F^(T)  can  be  interpreted  as  the  projection  of  x  *(T)  onto  *(T) 
easily  seen  as  follows: 


From  (19), 

Q^CDl^d)  =  Qt(T)0N(T)fN(T)  = 


W^T) 

0 


fn(T) 


wN(T)fN(T) 

0 


From  (23a),  we  write  (30)  as 


Qt(T)FN(T)  = 


r  -i ' 

?  i (T) 


Since  Q(T)  is  unitary,  multiply  (31)  on  the  left  by  Q(T)  to  obtain 


fV)  =  Q(T) 


rl'  <T> 


__  i 1 

From  the  definition  of  Q(T)  and  x  ^  (T),  equations  (24)  and  (22),  we 


FN(T)  =  [q°(T)----qT_1(T)] 


<q°(T),x'1(T)> 


<qN‘1(T),x'1(T)> 


(29) 

This  is 

(30) 

(31) 

(32) 

have 

(33) 


This  is  equivalent  to 


FN(T)  =  I  <q1(T),  x” 1 (T)>  q1(T) 


which  was  to  be  proven.  We  will  also  need  to  define  the  forward  error  residual 


eJ(T)  =  x(T)  -  [  x(T-l) • • •  x(T-N) ] fN(T) 


Notice  that  by  (29),  (11a),  and  (12), 


[FN(T)]t  =  [  x(T-l) • • •  x(T-N)]fN(T) 


=  x(T)  -  cJ(T) 


4.B.  THE  ONE-STEP  BACKWARD  PREDICTOR 


It  will  also  be  necessary  to  solve  the  least  squares  problem; 


x(T-N) 


x(T-N+l) 


A(T+1) 


or  equivalently 

2 

I |xN(T+l )  -  6N(T+1)sN||  (38) 

Let  b^(T)  denote  the  "s^  that  minimizes  (38)  and  let  J^(T)  be  the  minimized 
norm.  Thus, 

2  2 

J„(T)  =  min  | |xN(T+1 )  -  0N(T+l)sN||  =  | |xN(T+l)  -  0N(T+l)bN(T) I  I  (39) 

_N 

s 

b^(T)  can  be  interpreted  as  the  one-step  backward  predictor  filter.  (38)  is 
minimized  just  as  in  the  forward  predictor  problem.  Therefore,  we  have 


10 


WN(T+l)bN(T)  = 


J^(T) 


N  1 

x{j  (T+l) 


iixjj"(T+i)n 


whe  re , 


Qt(T+l)6N(T+l) 


N  |  xj' 

l-N  I  xj" 


(T+l) 


(T+l) 


The  following  vector  will  be  needed 


BN(T)  =  xN(T+1)  -  6N(T+l)bN(T) 


Just  as  was  done  in  equations  (29)  -  (34),  it  is  possible  to  show  that 


N- 1 

0N(T+l)bN(T)  =  l  <qX(T+l),  xN(T+l)>  q^T+l) 
i=0 


Thus , 


BN(T)  =  xN(T+l)  -  1  <qa (T+l ) ,  xN(T+1)>  q^T+l) 


From  the  Gram-Schmidt  procedure,  (25),  we  immediately  have 


BN(T)  =  q^  (T+l ) 


Thus,  from  (39) 


2  2 

J$(T)  =  I  |BN(T)  |  |  =  llq^T+nil 


Another  useful  expression  for  JN(T)  is 


jJj(T)  =  <?(T+1),  xN(T+1)> 


This  follows  easily  from  the  Gram-Schmidt  procedure,  where  we  have 


2  j  2 

<qN(T),  xN(T)>  =  | |xN(T)| |  -  1  |<qi(T) ,  xN(T)> | 

i=0 

T- 1  2 

=  1  I <ql(T) ,  xN(T)> | 

i=N 

However,  from  (41) 

,  2  T-l  2 

I|xT(T+1)||  =  I  |<qi(T'),  xN(T)>I 

i=N 

Thus,  (47)  follows  from  (40b),  (49),  and  (48). 

We  define  the  backward  error  residual 

ejj(T)  =  x(T-N)  -  [  x (T) - x (T-N+l) ]bN(T) 

Notice  that  from  (45),  (42),  (11a),  and  (12),  we  have 

I?(t+D]t+1  =  [bn(t)]t+1 

=  x(T-N)  -  1  x (T) -  x (T-N+l )]bN(T) 

=  4<t) 

For  time  updates,  we  will  need  the  residual 

eJj'(T)  =  x(T-N)  -  (  x  (T) - x  (T-N+l)]bN(T-l ) 

4.C.  ORDER  UPDATES  FOR  FORWARD  PREDICTOR 

From  (23a),  the  N+l-order  forward  predictor  is  given  by 

WN+1(T)fN+1(T)  =  x’i'  (T) 


(48) 


(49) 


(50) 


(51) 


(52) 


(53) 


12 


From  (28),  WN+1(T)  can  be  partitioned  as 


WN+1(T)  = 


-K-l  { 


vAt) 


<q°(T),xN(T)> 


<qK_1(T),xN(T)> 


0  <qN(T),xN(T)> 


(54) 


-N' 


and  from  the  definition  of  x^  (T),  (41),  the  above  is  equivalent  to 


WN+1(T)  = 


vAt)  ' 


^’(T) 


0  •  •  •  0  <qN(T),xN(T)> 


(55) 


_-l’ 


Also,  it  is  clear  from  (22)  that  we  can  partition  xN+1(T)  as 


WT)  = 


XN  (T) 


<qN(T),x_1(T)> 


(56) 


*N+1, 


To  derive  order  updates,  partition  f  (T)  as 


fN+1(T)  = 


fjJ+1(T) 


f^+1(T) 
rN+r  ; 


(57) 


“"N+ 1  — N-^  1 

where  fjj  (T)  is  a  vector  representing  the  first  N  components  of  f  (T)  and 


-N+l 


fN+j(T)  the  last  component.  Substituting  (55),  (56),  and  (57)  into  (53)  leads 
to  one  vector  equation  and  one  scalar  equation  to  be  solved. 


(58a) 


Wl,(T)fJJ+1(T)  +  fJJ*}(T)  x|j  (T)  =  xjj1  (T) 


fN+l (T^  <  qN(T).xN(T)>  =  <qN(T),x‘1(T)> 


Since  W^(T)  is  invertible,  multiplying  (58a)  on  the  left  by  W^(T) 
using  (23a)  and  (40a)  gives 


fJJ+1(T)  =  fN(T)  -  fJj*}(T)bN(T-l) 


rN+l . 


fjJ+l(T)  is  found  from  (58b) 


fN+l(T)  =  <qN(T),x~1(T)>  =  <qN(T),x'1(T)> 

_U  _M 

<qN(T),xN(T)>  <qN(T),xN(T)> 


_  kN(T) 


Jb(T-l) 


where  we  have  made  use  of  (25c),  (47),  and  the  definition 


kN(T)  =  <^(1),  x_1(T)> 


Equations  (59)  and  (60)  can  be  combined  into 


fN+1(T)  = 


fN  (T) 

4. 

kN(T) 

-bN(T-l) 

0 

i 

jjj(T-l) 

1 

Order  updates  for  (T)  follow  easily  from  (23b)  with  N  replaced 


jJL(T)  =  x“i"(T)  2  +  \T|  x(0)|2 


(63) 


(58b) 

*  and 


(59) 


(60) 


(61) 


(62) 


by  N+l 


and  from  observing  that  by  (22)  (T)  can  be  partitioned  as 


-1" 

V  (T) 


<q^ (T) ,  x"1 (T)> 


xn+i(t) 


Thus, 


llx^'cnil2  =  |<qN(T),y"1(T)>|2  +  I  l*N+l(T)  I  1 2 


llq*(T)ll2 


i  Oi  i2 


where  we  have  used  (25c),  (46),  and  (61).  Therefore,  from  (63)  and  (65)  it 


follows  that 


jJ+1(T)  =  I  l*jj*"(T)  1 1 2  +  XT|  x  (0)  |2 


4m  -  4^ 

N  Jj(T-l) 


|kN(T) | 2 
jJ(T-l) 


Order  updates  for  the  forward  error  residual  follow  by  using  (62)  in  the 
defining  equation  (35)  with  N  replaced  by  N  +  1,  and  are  given  by 


eJ(T)  "  T~~  £n(t_1) 
N  jJ(T-l)  N 


(67) 


«W 


4.D.  ORDER  UPDATES  FOR  BACKWARD  PREDICTOR 

In  order  to  derive  order  updates  for  the  backward  predictor,  we  must 
start  with  the  least  squares  problem  (38),  with  N  replaced  by  N  +  1 .  Thus, 
we  have  the  least  squares  problem 

| |xN+1 (T+l)  -  0N+1(T+l)iN+1ll2  (68) 


To  derive  order  updates,  partition  as 


where  is  the  first  component  of  and  is  an  N-dimensional  column 

_N+i 

vector  consisting  of  the  last  N  components  of  s  .  From  (9a),  (10a),  and 
(12),  a  little  thought  will  show  that  we  can  make  the  following  partitioning: 


xN+1(T+l)  = 


0N+1(T+1)  = 


xN(T) 


AT/2x(0) 


x"1 (T) 


0N(T) 


Substituting  (69),  (70),  and  (71)  into  (68)  gives  the  equivalent  least  squares 


problem 


r o  i  r^2«(o)io . . . oi  m 


The  above  norm  can  be  minimized  with  the  help  of  the  following  T+l-by-T+1 
unitary  matrix 


Q(T) 


(73) 


Thus,  as  was  done  in  subsection  4A,  the  above  unitary  transformation  of  (72) 
yields  the  equivalent  form 

•  l 

2 


0 

SJ'(T) 

- 

.  S2"(I)  . 

L 

(74) 


where  we  have  used  (19),  (22),  and  (41).  The  "middle"  part  of  the  above  norm 
can  be  written  out  separately  so  that  (74)  can  be  split  into 


xJ'(T)  -  sfVJ'(T)  -  /(Di**1 


• 

0 

N+l 
"  S1 

mm 

(T> . 

XT'2  x<0) 

-1" 

V  (T> 


— N+l 

Notice  that  since  s^  appears  only  in  the  first  norm  of  (75), 


minimization  of  (75)  with  respect  to  s 


-N+l 


minimized  with  respect  to  sjj  ,  regardless  of  the  value  of 


implies  that  the  first  norm  must  be 

This  is 

simply  an  application  of  the  principle  of  optimality  of  dynamic  programming 
(ref  10).  The  principle  of  optimality  can  also  be  used  to  derive  the  Levinson 


algorithm  or  similarly  the  non-adaptive  lattice  algorithm  for  the  case  of 
known  signal  statistics  (ref  11).  Thus,  minimization  of  the  first  norm  gives 


WN(T)b^+1(T)  =  xJJ'(T)  -  b^+1x^lM(T) 


where  b”  (T)  has  been  partitioned  as 


bN+ 1 (T )  = 


b»+1(T) 


hN+1m 

bN  (T) 


Multiplying  (76)  on  the  left  by  W^(T)  \  and  using  (23a)  and  (40a)  gives 


bN+1(T)  =  bN(T~^  *  b”+1fN(T) 


Minimization  of  the  last  norm  with  respect  to  s^  is  a  simple  scalar 
problem.  It  is  given  by  the  result  that  for  any  two  vectors  x  and  y 


“g11  I  lx  -  sy|  |2  =  |  |x  -  by|  |2 


(79a) 


where 


b  = 

<y,y> 


(79b) 


Thus , 


b?+1(T)  = 


<x^1"(T),^"(T)> 
|x~1"(T) | |2  +  \T\  X(0) |2 
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From  (22), 


-1" 

XN  ^T)  = 


<qN(T),x'1(T)> 


<qT'1(T),7‘1(T)> 


and  from  (41)  and  (27), 


<<5N(T),xN(T)> 


Mil 

«>  * 


Thus , 


<x:1"(T),xjJ"(T)>  =  <qN(T),x'1(T)>%N(T),xN(T)> 


<qN(T),x"1(T)>X<gN(T),xN(T)> 


ilqN(T)l 


N  ' 
kN  (T) 


where  we  used  (46),  (47),  and  (61).  Thus,  from  (83)  and  (23b),  equation  (80) 


is  equivalent  to 


bj+1 1(T) 


kN(T)* 

jJ(T) 


Equations  (78)  and  (84)  can  be  combined  to  yield 


V 


0 

_w 

N  * 

.  k  (T) 

J^(T) 

1 

-N ,  ^ 

L  bN(T-l)  J 

-f  (T)  J 

Order  updates  for  J°(T)  are  obtained  directly  from  (75)  by  substituting 
bN+l(T)  for  sN+1.  Notice  that  the  first  norm  is  zero  because  of  (76).  Thus 


WT)  * 


*5  <T> 


b*+1(T) 


XT/2x(0) 

-1" 

V  (T) 


*n’  mil2  *  K1«)|VU(o)|2,|h->'(T,||2! 


-b^+1(T)<x5J"(T),x];1”(T)>  -  b^+1(T)*<5j',(T),x'1"(T)>* 


Using  (23b),  (40b),  (83),  and  (84)  in  the  above,  we  obtain 


m  =  jjjcr-n  '  |k-rT)  l-- 
N  jJ(T) 


Order  updates  for  e^(T)  are  obtained  by  using  (85)  and  (50) 


eN+l^  £N(T-1) 


k^rT 

4™ 


4.E.  TIME  UPDATES 


Order  updates  for  e^(T),  e^(T),  J^(T),  and  jj^(T)  are  given  by  equations 
(66),  (67),  (87),  (88)  and 


k"(T)  =  <q(T),x'1(T)> 


(61) 


Thus,  in  order  to  derive  time  updates  for  e^(T),  t^(T),  J^(T),  and  J^(T),  it 

N 

is  sufficient  to  derive  the  time  update  equation  of  k  (T)  for  all  orders  N. 
To  this  end,  the  following  T  by  1  column  vector  is  defined: 

-N  N  i  *  ; 

Z*(T)  =  I  [q  (T)]  qX(T)  N  <  T  (89) 

i=0 

This  section  borrows  from  ref  4,  albeit  in  keeping  with  the  spirit  of  this 

technical  report,  the  derivation  shows  explicitly  the  role  of  |q^(T)  .  .  . 

__T- 1  ~N 

q  (T)}.  This  vector  will  be  useful  in  deriving  time  updates  for  q  (T). 

also  has  the  following  property: 


<z  (T),v(T)>  = 


v(T)eSN(T) 

v(T)  orthogonal  to  SN ( T) 


(90) 


This  can  be  seen  by  observing  that  from  the  definition  (89)  we  have 


<zN(T),q1(T)>  = 


0  <  i  <  N 


N  <  i 


(91) 


_  N  -  i 

and  since  for  any  v(T)eS  (T),  v(T)  is  a  linear  combination  of  q  (T), 

i  =  0,1, •••N,  (90)  is  true. 

_ 

Order  updates  for  z  (T)  follow  almost  directly  from  the  definition  (89) 


Z  (T)  -  1  [q  (T) ]  q  (T)  +  [q  (T) ]„q  (T) 

i=0  1  1 


_w  * 

-N-l  ^  (T)  T 

=  zN  1  (T)  +  - —  qN(T) 


lqN(T)| 


z  1  (T)  +  Ar -  q  (T) 

jJJcr-i) 


where  (46)  and  (51)  have  been  used. 


It  will  now  be  shown  that 


xl/2_Nm 

— N  “  q(T)  ,, 

q  (T+l )  =  . ej  ( T)z  1  (T+l) 

h'  w 

L  £n  <t>  J 


The  reasoning  is  as  follows:  For  0  <i  <  N-l, 


a1/2-N(T) 

<  . . .  x1 (T+l )  >  =  \1/2<^N (T) .x1 (T)>  +  eS* (T)*[x1(T+1)] 

h '  N 

L  eN  (T) 


=  4  <T)*  l*i(T+l)  )T+1 


=  4' (T)  <zN'1(T+1),x1(T+1)>  (94) 


where  we  have  made  use  of  (90).  We  thus  have  from  (94)  and  (26) 


X1/2qN(T) 

4'm 


for  0  <  i  <  N-l 


-  £„' (T)z^-1(T+1) ,  q1 (T+ 1 ) 


Also , 


X1/2qN(T) 

eJ'(T) 


A1/20N(T) 


x (T)  *  *  *  x (T+l-N) 


bN(T-l) 


A1/2BN(T-1) 


£S'm 


a‘'%N(T) 

X (T-N) 


=  xN(T+1)  -  6N(T+l)bN(T-l) 


by  (45),  (42),  (52),  and  (9b).  Thus,  (96)  and  (89)  imply 


xl/2-Nm 

. .  £^' (T)zN”1(T+1)  =  x^(T+l)  -  I  a . q1 (T+l )  (97) 

eJ'(T)  i=° 


where  or  is  found  by  taking  the  inner  product  of  the  above  with  q  (T+l) 
and  using  (95)  to  yield 


<xN(T+l),qk(T+l)>  -  1  a*  <qX (T+l )  ^(T+Dx  =  0 


(98) 


<xN(T+l),qk(T+l)>  -  ajj  <qk(T+l)  ,qk(T+l)>  =  0 

Thus , 

ak  =  <qk(T+l),xN(T+l)> 

0  <  k  <  N-l 

Substituting  (99)  into  (97)  verifies  (93). 

N 

Time  updates  for  k  (T)  can  now  be  derived.  From  (61), 
kN(T+l)  =  <qN(T+l),x_1(T+l)> 


x  ^T+l)  can  be  partitioned  by  use  of  (9b)  and  (10a) 


x~ * (T+l )  =  A1/2(T+1)  X-1 (T+l) 


|"  A1/2A1/2(T)  !  0 

X_1(T) 

o 

1 _ 1 

x  (T+l)  _ 

X^V^T) 

x  (T+l) 


Thus,  substitution  of  (101)  and  (93)  into  (100)  gives 


kN(T+l)  =  \<qN(T),x_1(T)>  +  ejj’(T)"  x(T+l) 


(99) 


(100) 


(101) 


h’  *  N-l  -1 

-  ejj  (T)  <z  1  (T+l ) , x  (T+l  )> 


(102) 
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From  (34),  (36),  and  (90),  we  have 

<zN'1(T+1),x"1(T+1)>  =  <zN_1(T+1),  FN(T+1)> 

=  [FN(T+1)]T+1 

-  *  (T+l)  -  eJ(T+l)  (103) 

Therefore,  (102)  is  simplified  to 

kN(T+l)  =  \kN(T)  +  eJj'(T)*e*(T+l)  (104) 

_ n  jj  ■ 

Since  b  (T)  is  not  actually  computed,  an  equivalent  expression  for  (T)  must 

_  n 

be  derived  that  does  not  depend  upon  b  (T).  Notice  that  the  inner  product  of 
zN(T+l)  with  (93)  gives 

<^(1+1),  qN(T+l)>  =  ejj'(T)  -  (T)<zN(T+l),  zN_1(T+l)>  (105) 

From  (92),  we  have 
<zN(T),zN-1(T)> 

=  <zN‘1(T),zN'1(T)> 

=  oN_1(T-1)  (106) 


=  <zN-1(T),zN-1(T)>  ♦  LillU-  cq^ (T)  ,z^~l  (T)> 

jJ(T-l) 


25 


where  the  following  definition  is  made 


a 


aN(T-l)  =  ||zN(T)||2 


Also,  from  (91)  and  (51) 


(107) 


c/d+l),  qN(T+l)>  =  [qN(T+l)lT+1 


=  4<t) 


(108) 


Therefore,  (105)  can  be  written  as 


£n(t)  =  eS'(T)(1  ”  orN"1(1^ 


and  substitution  into  (104)  gives  the  time  update 


N  m  eS(Tt  e*(T+l) 

k  (T+l)  =  \k  (T )  +  — - if-. - 

1  -  a  ” 1 (T) 


(109) 


(110) 


Thus,  time  updates  for  k^(T)  is  given  for  all  orders  N,  provided  order  updates 
of  a^(T)  are  derived  for  all  time  T.  The  inner  product  of  z^(T)  with  (92) 


yields 


<zN(T),zN(T)> 


<zN(T),zN_1(T)> 


eJ(T-l)"[qN(T))T 


=  <zJ,_1(T),zN'1(T)> 


ujcr-ni2 

jJ(T-l) 


26 


(111) 


5.  THE  LEAST  SQUARES  LATTICE  ALGORITHM 


Equations  (66),  (67),  (87),  (88),  (110),  and  (112)  constitute  the  lattice 

r  l  r  k 

algorithms.  The  variables  e^(T),  £^(T),  J^(T),  J^(T)  are  easily  initialized 
at  N=0  by  considering  the  least  squares  problems  (17)  and  (38)  with  N=0,  as 
well  as  (36)  and  (51).  Thus, 

J*(T)  =  | |x°(T+l) | | 2 

£q(T)  =  x(T) 

J*(T)  =  | |x°(T+l) | | 2 

e^(T)  =  x(T)  (113) 


However,  as  was  done  in  (101),  we  have 


x°(T+l)  = 


\V(T) 

X(T) 


and  therefore 


(114) 


|  |x°(T+l) | | 2  =  A | |x°(T) | | 2  +  |  x  (T)  | 2 


(115) 


The  initializations  of  (113)  can  be  written  as 
£q(T)  =  e^(T)  =  x (T) 

Jq(T)  =  J^(T)  =  AJ^(T-l)  +  |  x  (T) | 2  (116) 

From  (89)  we  have  for  N=0 

| |z°(T) | |2  =  |[q°(T)]T|2 


Ilq  (T) | | 


Thus , 


a°(T-l) 


Jb(T-l) 

o 

|eotT-'>|2 

Jb(T-l) 

o 


However,  notice  that  (118)  can  be  given  by  (112)  if  we  define 
a-1(T-l)  =  0 


The  least  squares  lattice  algorithm  is  summarized  below. 


e^(T)  =  eb(T) 


=  x(T) 


J^(T)  =  Jb(T) 


*Jb(T-l)  +  |  x  (T)  | 2 


o_1(T)  =  0 


For  N=0, 1 , • • *N  <  T 

n  — 


n  u  eS(T-l)  eJ(T) 

k  (T)  =  AkN(T-l)  +  N-  WT-  - - 

1-0  1  (T- 1) 


cP (T- 1 )  =  o^’l(T-l )  + 


^(T-l)|2 


jJj(T-l) 


kN(T)eJ(T-l) 

T)  =  4(T) - vr~~~ - 


( 1 2 1  d ) 


£N+l^  ¥T-» 


k  (T)  eJ(T) 

jn<t> 


-WT> 


4m 


M  4 

k*(T) 

J$(T-1) 


4*1 CT) 


jJ(T-l)  - 


kN(T) 

jn<t> 


(121e) 


( 12 1 f ) 


In  order  to  prevent  division  by  zero,  it  is  recommended  that  J^(-l)  be 

initialized  to  some  small  positive  number  6. 


Jo('l)  =  Jo(_1)  =  6  (122) 

Also,  the  algorithm  assumes  that  k^(T)  is  initialized  to  zero  until  it  is  used 
in  the  update  (121a). 


6.  SOLUTION  OF  THE  GENERAL  LEAST  SQUARES  PROBLEM 


We  are  now  in  a  position  to  obtain  order  updates  to  the  minimization  of 


I ly(T+i)  -  eN(T+i)sNir 


023) 


,th 


First,  we  need  the  general  solution  to  the  N  -order  problem,  along  with  some 
useful  properties.  Just  as  for  the  forward  and  backward  predictor,  it  can  be 
shown  that  (123)  is  equivalent  to 


||y^(T+l)  -  WN(T+1)sN| I2  +  ||y"(T+l)||: 


where 


Q  (T+l)y(T+l)  = 


T+l 


*  I 

-Ni 


yN(T+D 

y«(T+D 


_N 


and  minimization  of  (124)  is  given  by  s  = 


h^(T),  where 


WN(T+l)hN(T)  =  y^T+l) 


and  the  minimum  value  is 


jn(t)  =  I iyN(T+1) 1 r 


Also,  we  define  the  following  T+l  by  1  column  vector 


HN(T)  =  9N(T+l)hN(T) 


Just  as  in  equations  (29)  -  (34),  we  have 


N-l 


(T)  =  2  <q1(T+l),y(T+l)>qi(T+l) 

i=0 


(124) 


(125) 


(126a) 


(126b) 


(127) 


(128a) 
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and  consequently 


y(T+l )  -  HN(T)  =  y(T+l)  -  1  ^(T+l),  y (T+l )>q* (T+l ) 


(128b) 


Order  updates  are  derived  in  exactly  the  same  fashion  as  for  the  forward 
predictor,  but  with  q^(T),  x  *(T),  f^(T),  and  W^(T)  replaced  by  q^T+l), 
y(T+l ) ,  hN(T),  and  W^T+l),  respectively.  Therefore, 


_N+1  hN(T)l  JX,  f-bN(T) 

ti  (T)  =  .  +  . . 

0  JJ(T)  1 


(129a) 


WT)  =  VT)  - 


(129b) 


where, 


kN(T)  =  <qN(T+l) ,  y(T+l)> 


(130) 


Order  updates  for  e^(T)  and  e^(T)  are  easily  obtained  by  use  of  (129a)  with 
(15a),  (15b),  (50),  (52),  and  (109) 


eN+l^  "  eN^ 


4^  eJCT) 
jJ(T) 


(131a) 


£N+1<'T^  ~  £N^T-) 


kN(T-l)  .  V1-1 

J?,(T-1)  l-aN_1(T) 


(131b) 


Time  updates  for  K  (T)  are  derived  in  the  same  manner  as  for  k  (T).  By 
using  (93)  and  proceeding  as  in  equation  (95),  with  x  1 (T+l )  replaced  by 
y(T+l),  we  obtain 

kN(T)  =  \kN(T-l)  +  ejj'  (T)*«/(T+1) 


-  e^'(T)*  <  zN_1(T+1),  y(T+l)> 


(132) 


From  (128a),  we  have 


<^N_1 (T+l) ,  y(T+l)>  =  (HN(T)]t+1 


=  l  x  (T)  •  •  •  x(T-N+l)]hN(T) 


=  9(T)  -  £n(T) 


Therefore,  using  (133)  and  (109)  with  (132)  gives 


kN(T)  =  \kN(T-l)  + 


eN(T)*eN(T) 

1-a^"1 (T) 


(133) 


(134) 


Also,  from  (128b)  it  is  possible  to  show  in  a  way  similar  to  that  in  equations 
(93)  -  (99)  that 


y (T+l )  -  (T)  = 


Ay(T)-HN(T-l)) 


-  eN(T)zN‘1(T+l) 


(135) 


The  inner  product  of  z  (T+l)  with  (135)  yields 


eN(T)  =  £n(T)  (l-aN_1(T))  (136) 

Thus,  substitution  of  (136)  into  (134)  gives  an  equivalent  time  update  for 

''w  ' 

k  (T)  as  a  function  of  e„(T). 

N 

kN(T)  =  \kN(T-l)  +  eJ(T)*£N(T)  (137) 

Thus,  order  updates  are  given  by  (131a,  b)  with  time  updates  (134)  and  (137) 

» 

From  (123)  it  is  quite  clear  that  £„(T)  and  £M(T)  are  initialized  as 


7.  NUMERICAL  CONDITIONING 


It  is  well  known  that  in  solving  ill-conditioned  least  squares  problems, 
some  algorithms  may  prove  to  be  numerically  superior  to  others  in  the  sense 
that  they  may  need  only  half  the  word  length  to  give  the  same  result,  or 
equivalently,  can  provide  far  greater  numerical  precision  with  the  same  word 
length.  A  classic  example  is  that  of  the  Kalman  filter  and  the  so-called 
square-root  filter  (ref  12,  13).  Both  algorithms  solve  the  same  least  squares 
problem,  but  the  square-root  filter  is  numerically  superior  for 
ill-conditioned  problems.  This  is  because  the  Kalman  filter  can  be  shown  to 
be  based  upon  the  pseudoinverse  method  of  solving  the  normal  equation 
associated  with  least  squares,  whereas  the  square-root  filter  is  based  upon 
using  a  unitary  transformation  to  solve  the  least  squares  problem,  as  outlined 
in  Section  4. A.  The  use  of  this  unitary  transformation  (or  also  called  a 
Householder  transformation)  leads  to  a  matrix  equation  involving  the  Cholesky 
decomposition  of  a  covariance  matrix,  whereas  the  normal  equation  involves  the 
covariance  matrix  directly.  The  condition  number  of  the  covariance  matrix  is 
the  square  of  the  condition  number  of  the  Cholesky  decomposition,  and  thus  the 
square-root  filter  is  better  conditioned.  The  Cholesky  decomposition  of  a 
matrix  R  ir  S,  where  S^S  =  R.  S  looks  like  a  generalized  "square  root"  and 
hence  the  name  of  the  filter.  See  ref  14  for  further  details. 

It  at  first  might  appear  that  the  least  squares  lattice  filter  is  similar 
to  the  square-root  filter.  For  instance,  equations  (23a),  (40a),  and  (126a) 
are  of  the  general  form 


(T)U^  =  V*1 


(139) 


(T)  is  actually  a  Cholesky  decomposition  (ref  14)  of  the  sample  auto¬ 


covariance  matrix 

RN(T)  =  eN(T)W*(T) 


(140) 


This  is  seen  as  follows 


RN(T)  =  0N(T)t0N(T) 

=  0N(T)tQ(T)Qt(T)0N(T) 

=  (Qt(T)0N(T)]tQ^(T)0N(T) 

=  V^OnV^T)  (141) 


The  condition  number  (ref  14)  of  (139)  is  thus  given  by 


i  max 

n  • 

'min 


(142) 


where  n  and  n  .  are,  respectively,  the  maximum  and  minimum  eigenvalues  of 
R.  This  is  to  be  contrasted  with  the  pseudoinverse  method  of  solving  least 
squares  problems,  which  leads  to  normal  equations  of  the  type 


N  -N 
R  (T)U 


r 


with  condition  number 


( 143a ) 


^max  (143b) 

0  - 
min 

The  well-conditioned  property  of  (139)  is  not  maintained  because  of  the 
N+l  N+l 

way  in  which  f^+^(T)  and  b^  (T)  are  updated.  For  example,  from  equation  (60) 
we  have 


36 


(T) 

rN+r 


<q^(T )  ,~x~  *  (T)> 
<^N(T),xN(T)> 


(144) 


_ N 

However,  time  updates  were  not  derived  for  q  (T),  but  rather  the  numerator 

— N 

and  denominator  of  (144)  are  multiplied  by  ||q  (T)||  to  yield 


f^+1(T)  = 
N+lu  ’ 


^gVhx'^T)) 


(145) 


— N  N  b 

and  updates  are  then  derived  for  q  (T)  and  consequently  k  (T)  and  J^(T-l).  It 

is  for  this  reason  that  the  least  squares  lattice  filter  does  not  have 

condition  number  Jr\  /n  Since  the  algorithm  is  an  order  and  time 

max  mm 

recursive  solution  to  a  normal  equation  (ref  2),  it  can  be  concluded  that  the 

condition  number  is  n  /n  .  . 

'max  'min 


The  normalized  least  squares  lattice  filter  will  not  be  derived  here.  It 

further  requires  time  updates  for  a^(T),  J^(T),  jj^(T)  and  J^(T).  Suffice  it 

to  say  that  e^(T),  £^(T)  and  c|j(T)  are  normalized  such  that  their  magnitudes 

are  less  than  one.  Also  the  geometric  mean  of  the  reflection  coefficients, 

k^(T)/jb(T-l)  and  k^(T)/j5.(T) ,  is  used  and  is  in  magnitude  less  than  one.  The 
N  N 

net  result  is  an  algorithm  with  fewer  update  equations  and  with  variables  in 
magnitude  less  than  one  except  for  an  "unnormalization"  to  obtain  the  desired 
residuals,  which  involves  updates  for  quantities  of  magnitude  greater  than 
one.  This  normalized  algorithm  may  be  better  suited  for  fixed-point 
arithmetic,  although  there  is  still  at  least  one  update  equation  in  which 
overflow  can  occur  (ref  3). 
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The  normalized  algorithm  is  erroneously  considered  to  have  better 


numerical  conditioning  than  the  unnormalized  algorithm.  Despite  the 
normalization  of  the  variables,  the  normalized  algorithm  is  still  based  upon 
updates  for  q  (T)  and  not  q  (T),  and  thus  has  the  same  numerical  conditioning 
as  the  unnormalized  algorithm. 


A  simple  numerical  example  can  illustrate  the  conditioning.  Consider  a 
second-order  problem  to  solve  the  weighted  least  squares  problem  of  (13)  with 
x(T)  and  ij( T)  equal  to  one  for  all  T.  We  thus  have  the  least  squares 
problem 


'  1  * 
1 

'l 

1 

o' 

1 

_  1  _ 

_1 

1  _ 

A(T+1) 


(146) 


or 


■  a(T-1)/2- 

-a(T-1)/2 

0 

a(T-2)/2 

a(T-2)/2 

a(T-2)/2 

K* 

1 

A* 

1 

A** 

1 

-2 

s 


(147) 


(T- 1 ) /2 

Notice  that  for  A  <  1,  Av  -*■  0  for  T  -»  ®,  and  the  two  columns  of  the 

above  matrix  tend  to  be  "less"  numerically  independent  as  T  increases.  More 
N 

specifically,  0  (T),  N  =  2,  becomes  more  ill-conditioned  as  T  increases.  As  a 
result,  various  quantities  such  as  Jj(T)  0  for  T  ■>  00  and  division  by  zero 


will  occur  because  of  finite  word  length.  All  that  is  needed  to  investigate 


the  conditioning  of  the  various  algorithms  is  to  solve  the  least  squares 
problem  for  A  <  1  on  a  digital  computer  and  observe  the  time  T  at  which  a 
division  by  zero  is  attempted.  One  will  find  that  a  division  by  zero  is 
attempted  for  the  unnormalized  and  normalized  least  squares  lattice  at  the 
same  time  T,  whereas  the  "square-root"  type  algorithm  based  upon  a  unitary 
transformation  outlined  earlier  will  operate  for  twice  as  long  before  a 


division  by  zero  occurs. 


8.  SUMMARY 


A  new  derivation  of  the  least  squares  lattice  filter  is  given  in  this 
report.  This  derivation  shows  that  the  least  squares  lattices  (both 
unnormalized  and  normalized)  have  the  same  numerical  conditioning  as  in  the 
case  of  solving  the  least  squares  problem  by  the  normal  equation.  Thus  it  is 
incorrect  to  consider  that,  for  ill-conditioned  problems,  the  normalized 
lattice  is  superior  to  the  unnormalized  lattice  in  the  same  sense  that  the 
square-root  filter  is  numerically  superior  to  the  Kalman  filter. 
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